Script to analyze the number agreement errors made by Spanish native speakers in a web-based pronoun production experiment (Experiment 1: number agreement). Created May 2022, by Sol Lago and Nasimeh Bahmanian.
2 Modeling issues
In the majority of the models, we had to simplify the random-effects structure due to convergence issues. But even in the simplified models, we face singularity issues. I know that we should not simply ignore singularity warnings, but we decided not to simplify our models further. So, in addition to how to resolve convergence issues without oversimplifying a model, I would like to discuss singularity during the summer school and better understand the consequences of ignoring it.
3 Background
In this experiment participants were asked to describe scenes of moving objects by producing sentences (in Spanish) like the English example below:
The antecedent and the attractor either matched or mismatched in number. We are interested in whether responses in the mismatch conditions differ from the match conditions (1) in terms of accuracy, i.e. whether participants made agreement errors in producing the pronoun, and (2) in terms of latency in planning/producing the pronoun in error-free responses. For the latter, we look at the duration of critical regions as well as the likelihood of a non-zero pause immediately before the critical region.
4 ReadME
Some of the variables
‘TargetResponseIdentical’
‘ResponseComplete’
‘NumberError’
‘GenderError’
‘OtherDifferences’
are only needed to create the two data frames: the accuracy data and the latency data.
For the statistical analyses, I need different sets of variables depending on the analysis:
Analysis of errors
‘NumberError’
‘Match’
‘AntecedentNumber’
Analysis of pauses
‘Segment’
‘Pause’
‘Match’
‘AntecedentNumber’
Analysis of durations
‘Segment’
‘Duration’
‘Match’
‘AntecedentNumber’
‘Structure’
‘Syllable’
‘NounGender’
5 Setup
#library(summarytools) # data frame visualization and summary
library(binom)     # binomial confidence intervals
library(tidyverse) # data manipulation and plotting
library(ggridges)  # density plots
library(scales)    # scales for plots
Attaching package: 'scales'
The following object is masked from 'package:purrr':
discard
The following object is masked from 'package:readr':
col_factor
library(lme4)      # mixed-effects models (glmer)
Loading required package: Matrix
Attaching package: 'Matrix'
The following objects are masked from 'package:tidyr':
expand, pack, unpack
library(car) # plot residuals
Loading required package: carData
Attaching package: 'car'
The following object is masked from 'package:dplyr':
recode
The following object is masked from 'package:purrr':
some
library(patchwork) # combine plots
library(MASS) # boxcox
Attaching package: 'MASS'
The following object is masked from 'package:patchwork':
area
The following object is masked from 'package:dplyr':
select
library(Hmisc) # descriptive statistics
Loading required package: lattice
Loading required package: survival
Loading required package: Formula
Attaching package: 'Hmisc'
The following objects are masked from 'package:dplyr':
src, summarize
The following objects are masked from 'package:base':
format.pval, units
accuracydata <- mutate(productiondata) %>%
  # Keep one row per trial.
  dplyr::select(!("Segment":"Syllable")) %>%
  unique() %>%
  # Exclude incomplete responses.
  filter(ResponseComplete == "yes") %>%
  # Exclude responses with gender errors or distorted preambles.
  filter(!(GenderError %in% c("yes"))) %>%
  filter(!(OtherDifferences %in% EXCLUDE_ACCURACY)) %>%
  # Convert number errors to a numeric variable (0 = correct; 1 = error).
  mutate(NumberError = ifelse(NumberError == "no", 0, 1)) %>%
  # Re-order factor levels.
  mutate(Condition = factor(Condition, levels = c("ss", "sp", "pp", "ps"))) %>%
  mutate(AntecedentNumber = factor(AntecedentNumber, levels = c("singular", "plural"))) %>%
  droplevels()
6.5 Create data frame for latency analysis
latencydata <- mutate(productiondata) %>%
  # Exclude incomplete responses.
  filter(ResponseComplete == "yes") %>%
  # Exclude responses that are not identical to the target sentences.
  filter(TargetResponseIdentical == TRUE) %>%
  # Re-order factor levels.
  mutate(Condition = factor(Condition, levels = c("ss", "sp", "pp", "ps"))) %>%
  mutate(AntecedentNumber = factor(AntecedentNumber, levels = c("singular", "plural"))) %>%
  mutate(Segment = factor(Segment, levels = c("onset", "antecedent", "verb", "attractor",
                                              "adverb", "pronoun", "attractor_pronoun"))) %>%
  droplevels()
6.6 Outlier exclusions for latency data
Outlier detection is done using the IQR method described here.
According to this method, outliers are all observations above Q75 + 1.5 IQR or below Q25 - 1.5 IQR (where Q25 and Q75 correspond to the first and third quartile, and IQR is the difference between them).
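As a minimal base-R sketch of this rule (the duration vector here is made up; in the analysis the rule is applied to `latencydata$Duration`):

```r
# IQR outlier rule: flag observations outside [Q25 - 1.5*IQR, Q75 + 1.5*IQR].
# Toy duration vector for illustration only.
dur <- c(180, 200, 210, 220, 230, 240, 250, 900)

q     <- quantile(dur, probs = c(.25, .75), names = FALSE)
iqr   <- q[2] - q[1]
lower <- q[1] - 1.5 * iqr
upper <- q[2] + 1.5 * iqr

is_outlier <- dur < lower | dur > upper
dur[is_outlier]  # the 900 ms observation is flagged
```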
6.7 Visualize outliers according to the IQR method.
ggplot(latencydata, aes(x = Condition, y = Duration)) + geom_boxplot()
plot_duration_all <- latencydata %>%
  # Exclude utterance onset and the combined post-attractor region.
  filter(!Segment %in% c("onset", "attractor_pronoun")) %>%
  # Separate critical region from other segments.
  mutate(SegmentType = factor("All regions")) %>%
  # Compute mean durations by participant.
  group_by(Participant, SegmentType, Segment, Condition, AntecedentNumber, Match) %>%
  summarise(.groups = "keep", Mean = mean(Duration, na.rm = TRUE)) %>%
  # Draw plot.
  {ggplot(., aes(x = Segment, y = Mean, group = Condition, color = Condition)) +
      # Plot by-condition means and standard error bars.
      stat_summary(fun = mean, geom = "path", aes(linetype = Condition),
                   size = 0.5, position = dodge, na.rm = TRUE) +
      stat_summary(fun.data = mean_se, geom = "errorbar", size = 0.5, width = 0,
                   position = dodge, show.legend = FALSE) +
      stat_summary(fun = mean, geom = "point", shape = 18, size = 2,
                   position = dodge, show.legend = FALSE) +
      # Place utterance onset and sentence segments in different facets.
      facet_grid(~ SegmentType) +
      # Customize axes and scales.
      labs(x = NULL, y = "Duration [ms]") +
      scale_x_discrete(labels = label_regions) +
      scale_color_manual(values = color_cond, labels = label_cond) +
      scale_linetype_manual(values = linetype_cond, labels = label_cond) +
      # Customize theme.
      theme_light() +
      theme(legend.position = "top") +
      theme(text = element_text(size = 10, colour = "gray28")) +
      theme(axis.text.x = element_text(size = 7, vjust = 0.5, margin = margin(10, 0, 0, 0))) +
      theme(strip.text.x = element_text(size = 10))}
7.5 Post-attractor region
plot_duration_critical <- latencydata %>%
  # Select only the post-attractor combined region.
  filter(Segment == "attractor_pronoun") %>%
  # Separate critical region from other segments.
  mutate(SegmentType = factor("Adverb + pronoun region")) %>%
  # Compute mean durations by participant.
  group_by(Participant, SegmentType, Segment, Condition, AntecedentNumber, Match) %>%
  summarise(.groups = "keep", Mean = mean(Duration, na.rm = TRUE)) %>%
  # Add by-condition means to the data frame.
  group_by(Condition) %>%
  mutate(MeanCondition = mean(Mean)) %>%
  # Draw plot.
  {ggplot(., aes(x = Condition, y = Mean, color = Condition)) +
      # Draw boxplots.
      geom_boxplot(size = 0.5, outlier.shape = NA) +
      # Draw by-participant means.
      geom_point(size = 1.5, alpha = 0.5, stroke = 0, position = jitter) +
      # Draw by-condition means.
      stat_summary(fun = mean, geom = "point", shape = 18, size = 4) +
      geom_label(data = unique(dplyr::select(., c(Condition, MeanCondition, AntecedentNumber))),
                 aes(x = Condition, y = MeanCondition, label = round(MeanCondition, 0)),
                 size = 3.5, nudge_y = 23, label.padding = unit(.15, "lines"),
                 alpha = 0.5, label.size = NA) +
      # Facets for each segment.
      facet_grid(. ~ SegmentType) +
      # Customize axes and scales.
      labs(x = NULL, y = NULL) +
      scale_color_manual(values = color_cond) +
      scale_x_discrete(labels = label_cond) +
      # Customize theme.
      theme_light() +
      theme(legend.position = "none") +
      theme(text = element_text(size = 10, colour = "gray28")) +
      theme(strip.text.x = element_text(size = 10))}
7.6 Concatenate and save
theme_custom <- theme_light() + theme(legend.position = "top")
theme_set(theme_custom)
jpeg("./figures/durations_by_region.jpeg", units = "cm", width = 16, height = 12, res = 300)
(plot_duration_all | plot_duration_critical) +
  plot_layout(widths = c(1.8, 1)) +
  plot_layout(guides = "collect")
dev.off()
7.7 Likelihood of pauses by condition
average_pauses <- mutate(latencydata) %>%
  # Keep only pause data.
  filter(!is.na(Pause)) %>%
  # Compute averages by condition and segment.
  group_by(Segment, Condition) %>%
  dplyr::summarise(.groups = "keep",
                   NPause = sum(Pause),
                   N = length(Pause),
                   Mean = NPause / N,
                   CIlow = as.numeric(binom.confint(NPause, N, methods = "agresti-coull")["lower"]),
                   CIhigh = as.numeric(binom.confint(NPause, N, methods = "agresti-coull")["upper"]))

# Plotting parameters
jitter <- position_jitter(width = 0.2, height = 0, seed = 0)

# Plot
plot_pause <- latencydata %>%
  # Keep only pause data.
  filter(!is.na(Pause)) %>%
  # Compute mean pause rates by participant.
  group_by(Participant, Segment, Condition, AntecedentNumber) %>%
  summarise(.groups = "keep",
            NPause = sum(Pause),
            N = length(Pause),
            Mean = NPause / N) %>%
  # Add by-condition means to the data frame.
  group_by(Segment, Condition) %>%
  mutate(MeanCondition = mean(Mean, na.rm = TRUE)) %>%
  # Draw plot.
  {ggplot(., aes(x = Condition, y = Mean, color = AntecedentNumber)) +
      # Draw by-participant means.
      geom_point(size = 0.9, alpha = 0.3, stroke = 0, position = jitter) +
      # Draw group means and labels.
      stat_summary(fun = mean, geom = "point", shape = 18, size = 2) +
      geom_label(data = unique(dplyr::select(., c(Segment, Condition, MeanCondition, AntecedentNumber))),
                 aes(x = Condition, y = MeanCondition, label = round(MeanCondition, 3) * 100),
                 size = 2, nudge_y = 0.025, label.padding = unit(.15, "lines")) +
      # Facets for each segment.
      facet_wrap(. ~ Segment, ncol = 3, scales = "free") +
      # Customize axes and scales.
      labs(x = NULL, y = "Pause likelihood") +
      scale_color_manual(values = c("#3C425A", "#CC9A41")) +
      scale_x_discrete(labels = c("SS", "SP", "PP", "PS")) +
      scale_y_continuous(labels = scales::percent_format(accuracy = 1)) +
      # Customize theme.
      theme_light() +
      theme(legend.position = "none") +
      theme(text = element_text(size = 8, colour = "gray28")) +
      theme(strip.text.x = element_text(size = 8))}
plot_pause
# Save plot.
# ggsave("./figures/pauses.jpeg", plot_pause, units = "cm", height = 10, width = 12, dpi = 300)
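For reference, the Agresti-Coull interval that `binom.confint()` computes above can be sketched in base R. The function name `agresti_coull` and the counts are my own, for illustration only:

```r
# Agresti-Coull interval: add z^2/2 successes and z^2/2 failures to the counts,
# then build a Wald interval around the adjusted proportion.
agresti_coull <- function(x, n, conf = 0.95) {
  z       <- qnorm(1 - (1 - conf) / 2)
  n_tilde <- n + z^2
  p_tilde <- (x + z^2 / 2) / n_tilde
  half    <- z * sqrt(p_tilde * (1 - p_tilde) / n_tilde)
  c(lower = p_tilde - half, upper = p_tilde + half)
}

agresti_coull(12, 40)  # lower ~ 0.180, upper ~ 0.455
```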
8 Contrasts
AntecedentNumber: sum coded (-1 singular, +1 plural).
Match: sum coded (-1 match, +1 mismatch).
Structure: sum coded (-1 no_preposition, +1 with_preposition).
RK: The recommendation is to stay with digits.
# Contrasts for accuracy data.
contrasts(accuracydata$AntecedentNumber) <- -contr.sum(2)
contrasts(accuracydata$Match) <- -contr.sum(2)
contrasts(accuracydata$Structure) <- -contr.sum(2)
# Contrasts for latency data.
contrasts(latencydata$AntecedentNumber) <- -contr.sum(2)
contrasts(latencydata$Match) <- -contr.sum(2)
contrasts(latencydata$Structure) <- -contr.sum(2)
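A quick check of what `-contr.sum(2)` does (toy factor; the level names mirror `AntecedentNumber`):

```r
# -contr.sum(2) flips the default sum coding, so the first factor level
# gets -1 and the second +1, matching the coding scheme described above.
x <- factor(c("singular", "plural"), levels = c("singular", "plural"))
contrasts(x) <- -contr.sum(2)
contrasts(x)
#          [,1]
# singular   -1
# plural      1
```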
9 Check distribution
RK: I would recommend doing this much earlier, possibly even before the outlier detection – unless the outliers are clearly unreasonable.
The Box-Cox procedure is used to determine the best transformation (adapted from Osborne, 2010):
+ if 0 ≤ λ < 1: log transformation;
+ if -1 < λ < 0: reciprocal;
+ if λ = 1: no transformation needed.
RK: Hm?
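As a sanity check of this mapping, one can run `boxcox()` on simulated log-normal data, for which the true λ is 0 and the log transformation is the "right" answer (`plotit = FALSE` suppresses the profile plot; the simulation parameters are arbitrary):

```r
library(MASS)

# Simulated log-normal "durations": the Box-Cox profile should peak near 0,
# pointing to the log transformation.
set.seed(1)
y <- exp(rnorm(500, mean = 2, sd = 0.5))

bc <- boxcox(lm(y ~ 1), plotit = FALSE)
lambda_hat <- bc$x[which.max(bc$y)]
lambda_hat  # close to 0
```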
10 Data transformation
10.1 Adverb + pronoun region
bc <- boxcox(lm(Duration ~ 1, data = filter(latencydata, Segment == "attractor_pronoun")))
bc <- data.frame(cbind(lambda = bc$x, lik = bc$y))
bc <- bc[bc$lik == max(bc$lik), "lambda"]
lambda = 0.51, so the log transformation is used.
RK: Square-root would be an option, too.
10.2 Pronoun region
bc <- boxcox(lm(Duration ~ 1, data = filter(latencydata, Segment == "pronoun")))
bc <- data.frame(cbind(lambda = bc$x, lik = bc$y))
bc <- bc[bc$lik == max(bc$lik), "lambda"]
lambda = 0.83, so the log transformation is used.
RK: I would not transform.
10.3 Onset region
bc <- boxcox(lm(Duration ~ 1, data = filter(latencydata, Segment == "onset")))
bc <- data.frame(cbind(lambda = bc$x, lik = bc$y))
bc <- bc[bc$lik == max(bc$lik), "lambda"]
lambda = 0.47, so the log transformation is used.
RK: Square-root would be an option, too.
RK: I definitely would not use different transformations for different regions. My impression is that staying with the original metric would be ok.
11 (G)LMMs
11.1 GLMM for accuracy: AntecedentNumber * Match
NB: Random effects were simplified due to non-convergence.
RK: The model is not correctly specified. One problem is that you must not use factors with || in the random-effects structure (RES); that specification may cause false-positive convergence errors. AntecedentNumber and Match look like within-subject/within-item factors.
m_accuracy <- glmer(NumberError ~ AntecedentNumber * Match
                    + (1 + AntecedentNumber + Match || Participant)
                    + (1 + AntecedentNumber + Match || Item),
                    family = binomial, data = accuracydata)
Here random effects were simplified due to non-convergence, but we may just as well start with the so-called maximal LMM.
f1 <- "./fits/m_acc.rda"
if (!file.exists(f1)) {
  m_acc <- glmer(NumberError ~ 1 + AntecedentNumber * Match +
                   (1 + AntecedentNumber * Match | Participant) +
                   (1 + AntecedentNumber * Match | Item),
                 family = binomial, data = accuracydata,
                 control = glmerControl(calc.derivs = FALSE))
  save(m_acc, file = f1)
} else load(f1)
summary(rePCA(m_acc))
$Item
Importance of components:
[,1] [,2] [,3] [,4]
Standard deviation 1.3724 0.9706 0.05705 0.02759
Proportion of Variance 0.6657 0.3329 0.00115 0.00027
Cumulative Proportion 0.6657 0.9986 0.99973 1.00000
$Participant
Importance of components:
[,1] [,2] [,3] [,4]
Standard deviation 1.1118 0.6247 0.14893 0.02292
Proportion of Variance 0.7496 0.2366 0.01345 0.00032
Cumulative Proportion 0.7496 0.9862 0.99968 1.00000
RK: This model looks like it is supported by the data. Some of the CPs are of suspiciously large magnitude. We will use MixedModels.jl to check whether they are significantly different from zero.
11.2 GLMM for accuracy: AntecedentNumber / Match
RK: A nested model may be a good a priori or post-hoc alternative LMM. Usually it makes more sense to use the same formula in the fixed-effects and RES parts of the LMM – if such a complex RES is supported by the data.
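To see what the nested formula does, one can inspect the design matrix on a toy grid (a sketch with made-up level names): `AntecedentNumber / Match` expands to `AntecedentNumber + AntecedentNumber:Match`, i.e. a main effect of antecedent number plus a Match effect estimated separately within each antecedent number.

```r
# Toy design: crossing the two 2-level factors used in the nested model.
d <- expand.grid(
  AntecedentNumber = factor(c("singular", "plural"), levels = c("singular", "plural")),
  Match            = factor(c("match", "mismatch"))
)
contrasts(d$AntecedentNumber) <- -contr.sum(2)
contrasts(d$Match)            <- -contr.sum(2)

# Intercept, main effect of AntecedentNumber, and one Match effect
# per level of AntecedentNumber.
colnames(model.matrix(~ 1 + AntecedentNumber / Match, d))
```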
f2 <- "./fits/m_acc_nested.rda"
if (!file.exists(f2)) {
  m_acc_nested <- glmer(NumberError ~ 1 + AntecedentNumber / Match +
                          (1 + AntecedentNumber / Match | Participant) +
                          (1 + AntecedentNumber / Match | Item),
                        family = binomial, data = accuracydata,
                        control = glmerControl(calc.derivs = FALSE))
  save(m_acc_nested, file = f2)
} else load(f2)
summary(rePCA(m_acc_nested))
$Item
Importance of components:
[,1] [,2] [,3] [,4]
Standard deviation 1.4042 0.9526 0.03146 0.01630
Proportion of Variance 0.6845 0.3150 0.00034 0.00009
Cumulative Proportion 0.6845 0.9996 0.99991 1.00000
$Participant
Importance of components:
[,1] [,2] [,3] [,4]
Standard deviation 1.0521 0.6150 0.33061 0.02907
Proportion of Variance 0.6939 0.2371 0.06851 0.00053
Cumulative Proportion 0.6939 0.9310 0.99947 1.00000
We do get a convergence failure for the nested LMM. This strongly suggests that the first LMM may not be as well supported as it looks. We will check with MixedModels.jl.
11.3 LMM for duration for adverb + pronoun region: AntecedentNumber * Match
# Save plot.
# ggsave("vincentiles.jpeg", plot_vincentiles, units = "cm",
#        height = 12, width = 16, dpi = 300)
17 RK extensions
17.1 Save dataframes for transfer to Julia
library(arrow)
Attaching package: 'arrow'
The following object is masked from 'package:utils':
timestamp
dat_acc <- accuracydata |>
  as_tibble() |>
  dplyr::select(Subj = Participant, Item, Condition, Strct = Structure,
                Match, ACN = AntecedentNumber, nerr = NumberError) |>
  mutate(Subj = factor(paste0("S", str_pad(str_remove(Subj, "s"), width = 2, side = "left", pad = "0"))),
         Item = factor(paste0("I", str_pad(Item, width = 3, side = "left", pad = "0"))))
write_feather(dat_acc, "./data/Bahmanian_acc.arrow")

dat_dur <- latencydata |>
  as_tibble() |>
  dplyr::select(Subj = Participant, Item, Condition, Strct = Structure,
                Match, ACN = AntecedentNumber, Segment, dur = Duration, pause = Pause) |>
  mutate(Subj = factor(paste0("S", str_pad(str_remove(Subj, "s"), width = 2, side = "left", pad = "0"))),
         Item = factor(paste0("I", str_pad(Item, width = 3, side = "left", pad = "0"))))
write_feather(dat_dur, "./data/Bahmanian_dur.arrow")
17.2 Boxcox
library(arrow)
library(tidyverse)

dat <- read_feather("./data/Bahmanian_dur.arrow")

dat_dur <- dat |> filter(!is.na(dur))
MASS::boxcox(dur ~ 1 + Match + ACN + Subj, data = dat_dur)      # reciprocal value

dat_pronoun <- dat_dur |> filter(Segment == "pronoun")
MASS::boxcox(dur ~ 1 + Match + ACN + Subj, data = dat_pronoun)  # original value

dat_onset <- dat_dur |> filter(Segment == "onset")
MASS::boxcox(dur ~ 1 + Match + ACN + Subj, data = dat_onset)    # log-transformed value
Depending on which segment we look at, we get different proposals; I am not quite sure what to recommend in this case. The reciprocal is out. We could be tapping into different processes for the pronoun and onset segments, which would require different metrics. However, if at all possible, I would not change metrics across analyses of the same data, because this is very difficult to communicate. The log transform sounds like a reasonable choice, not least because it is usually required for all kinds of fixation measures.
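One point in favor of the log transform is that effects remain easy to communicate on the original millisecond scale: a fixed-effect estimate b on log(Duration) back-transforms to a multiplicative change of exp(b). A minimal arithmetic sketch (the value of b is made up):

```r
# On the log scale, a coefficient b corresponds to a multiplicative effect
# of exp(b) on raw durations; for small b this is approximately a b*100 % change.
b <- 0.05            # hypothetical mismatch effect on log(Duration)
exp(b)               # multiplicative factor, about 1.051
(exp(b) - 1) * 100   # about a 5.1 % increase in duration
```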